Fast algorithms for large-scale genome alignment and comparison.

Authors

Arthur L Delcher

Adam Phillippy

Jane Carlton

Steven L Salzberg

Abstract

We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating the sequence in all six reading frames, extracts all matching protein sequences and then clusters together matches. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
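The six-frame translation step mentioned in the abstract can be illustrated with a minimal sketch. This is not MUMmer's code: the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are rendered as `X`. It only shows how one DNA sequence yields six candidate protein sequences, three per strand, that a protein-level matcher could then compare.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to the translated protein strings."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

In a pipeline like the one the abstract outlines, the six translations of each genome would be searched for matching protein segments, and the matches would then be clustered; those later steps are not shown here.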

Related resources

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...

Computational methods for Multiple Genome Alignment and Synteny detection

Multiple genome alignments are useful to detect synteny, gene order, and large-scale genomic re-arrangements which help to understand genome evolution, divergence and the development of protein functions. However, aligning multiple whole genomes is very computationally intensive [3] and many genomes are only partially complete. Fast approximation algorithms have been developed to handle both th...

SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments

MOTIVATION The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comp...

Murasaki: A Fast, Parallelizable Algorithm to Find Anchors from Multiple Genomes

BACKGROUND With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, ...

Effect of Objective Function on the Optimization of Highway Vertical Alignment by Means of Metaheuristic Algorithms

The main purpose of this work is the comparison of several objective functions for optimization of the vertical alignment. To this end, after formulation of optimum vertical alignment problem based on different constraints, the objective function was considered as four forms including: 1) the sum of the absolute value of variance between the vertical alignment and the existing ground; 2) the su...

Journal:

Nucleic acids research

Volume 30, issue 11

Pages: -

Publication date: 2002
